Monitoring changes inside a reservoir in real time is crucial for the success of CO2 injection and long-term storage. Machine learning (ML) is well suited to real-time CO2 monitoring because of its computational efficiency. However, most existing ML applications yield only one prediction (i.e., the expectation) for a given input, which may not properly reflect the distribution of the testing data when that distribution is shifted relative to the training data. The Simultaneous Quantile Regression (SQR) method can estimate the entire conditional distribution of a neural network's target variable via the pinball loss. Here, we incorporate this technique into seismic inversion for CO2 monitoring. The uncertainty map is then calculated pixel by pixel from a particular prediction interval around the median. We also propose a novel data-augmentation method that samples the uncertainty to further improve prediction accuracy. The developed methodology is tested on synthetic Kimberlina data, which were created by the Department of Energy and are based on a CO2 capture and sequestration (CCS) project in California. The results show that the proposed network can estimate the subsurface velocity rapidly and with sufficient resolution. Furthermore, the computed uncertainty quantifies the prediction accuracy. The method remains robust even if the testing data are distorted by problems in field data acquisition. Another test demonstrates the effectiveness of the developed data-augmentation method in increasing the spatial resolution of the estimated velocity field and in reducing the prediction error.
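As a rough illustration of the pinball loss and the interval-based uncertainty mentioned above, the following minimal sketch uses plain Python lists in place of network outputs. The function names `pinball_loss` and `interval_width`, the sample values, and the choice of a lower and an upper quantile around the median are assumptions made for illustration; they are not taken from the paper's implementation.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (quantile) loss for a quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def interval_width(q_low, q_high):
    """Per-pixel uncertainty: width of the interval between two quantile maps."""
    return [hi - lo for lo, hi in zip(q_low, q_high)]


# Hypothetical samples with one outlier: for tau = 0.5 the loss prefers
# the median (3) over the mean (22).
samples = [1, 2, 3, 4, 100]
loss_at_median = pinball_loss(samples, [3] * 5, 0.5)
loss_at_mean = pinball_loss(samples, [22] * 5, 0.5)
print(loss_at_median < loss_at_mean)

# Hypothetical lower/upper quantile predictions for two pixels.
print(interval_width([1.0, 2.0], [1.5, 4.0]))
```

In an SQR-style setup the quantile level `tau` would typically be sampled at random during training and fed to the network as an extra input, so that a single model can be queried for any quantile; that part is omitted here.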
translated by Google Translate
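In subsurface seismic imaging, solving the acoustic wave equation is a key component of existing models. With the development of deep learning, neural networks have been applied to numerically solve partial differential equations, the wave equation in particular, by learning the mapping between the inputs and the solution of the equation, because traditional methods can be time-consuming when many instances must be solved. Previous work on solving the wave equation with neural networks considers either a single velocity model or multiple simple velocity models, which is restrictive in practice. Therefore, inspired by the idea of operator learning, this work leverages the Fourier neural operator (FNO) to effectively learn frequency-domain seismic wavefields in the context of variable velocity models. In addition, we propose a new framework, the paralleled Fourier neural operator (PFNO), for efficiently training the FNO-based solver given multiple source locations and frequencies. Numerical experiments demonstrate the high accuracy of both FNO and PFNO on complex velocity models from the OpenFWI datasets. Furthermore, cross-dataset generalization tests verify that PFNO adapts to out-of-distribution velocity models. PFNO also performs robustly in the presence of random noise in the labels. Finally, PFNO achieves higher computational efficiency than the traditional finite-difference method on large-scale testing datasets. These advantages give the FNO-based solver the potential to build powerful models for seismic wave research.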
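Model compression, such as pruning and quantization, has been widely applied to optimize neural networks on resource-limited classical devices. Recently, there has been growing interest in variational quantum circuits (VQCs), a type of neural network on quantum computers (also known as quantum neural networks). It is well known that near-term quantum devices have high noise and limited resources (i.e., quantum bits, or qubits); yet how to compress quantum neural networks has not been thoroughly studied. One might think it is straightforward to apply classical compression techniques to quantum scenarios. However, this paper shows that there are differences between the compression of quantum and classical neural networks. Based on our observations, we claim that [...] must be involved in the compression process. On top of this, we propose the first systematic framework, namely CompVQC, to compress quantum neural networks (QNNs). In CompVQC, the key component is a novel compression algorithm based on the alternating direction method of multipliers (ADMM). Experiments demonstrate the advantage of CompVQC, which reduces circuit depth (almost over 2.5%) with a negligible accuracy drop (<1%) and outperforms other competitors. Another promising result is that CompVQC can indeed improve the robustness of QNNs on near-term noisy quantum devices.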
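To make the ADMM idea concrete, here is a minimal sketch of ADMM-based pruning on a toy classical problem: find a vector with at most `k` nonzero entries that stays close to a given parameter vector `a`. This is a generic textbook-style formulation chosen for illustration, not the CompVQC algorithm; the function names, the quadratic objective, and the parameter values are all assumptions.

```python
def project_k_sparse(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest."""
    order = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)
    keep = set(order[:k])
    return [x if i in keep else 0.0 for i, x in enumerate(v)]


def admm_prune(a, k, rho=1.0, iters=50):
    """ADMM for: minimize 0.5 * ||w - a||^2 subject to w having <= k nonzeros.

    w is the free variable, z is its sparse copy, u is the scaled dual variable.
    """
    n = len(a)
    z = [0.0] * n
    u = [0.0] * n
    for _ in range(iters):
        # w-step: closed-form minimizer of 0.5*(w - a)^2 + (rho/2)*(w - z + u)^2.
        w = [(a[i] + rho * (z[i] - u[i])) / (1.0 + rho) for i in range(n)]
        # z-step: project w + u onto the set of k-sparse vectors.
        z = project_k_sparse([w[i] + u[i] for i in range(n)], k)
        # Dual update: accumulate the remaining disagreement between w and z.
        u = [u[i] + w[i] - z[i] for i in range(n)]
    return z


# Hypothetical parameters: the two small entries are pruned.
pruned = admm_prune([3.0, -0.1, 2.0, 0.05], k=2)
print(pruned)
```

In a real network the `w`-step would be a few gradient steps on the training loss plus the quadratic penalty rather than a closed-form update, and for quantum circuits the constraint set would have to reflect which gates can actually be removed or simplified.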
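Inversion techniques are widely used to reconstruct subsurface physical properties (e.g., velocity, conductivity) from surface-based geophysical measurements (e.g., seismic or electric/magnetic (EM) data). These problems are governed by partial differential equations (PDEs) such as the wave equation or Maxwell's equations. Solving geophysical inversion problems is challenging because of their ill-posedness and high computational cost. To alleviate these issues, recent studies leverage deep neural networks to learn the inversion mapping from measurements to properties. In this paper, we show that such a mapping can be modeled well by a very shallow (but not wide) network with only five layers. This is achieved through our new finding of an intriguing property: a near-linear relationship between the input and the output after applying integral transforms in a high-dimensional space. In particular, for the inversion from seismic data to subsurface velocity governed by the wave equation, the integrals of velocity with Gaussian kernels are linearly correlated with the integrals of seismic data with sine kernels. Furthermore, this property can easily be turned into a lightweight encoder-decoder network for inversion. The encoder consists of the integration of seismic data and a linear transformation, with no need for fine-tuning. The decoder consists of only a single transformer block that reverses the integral of velocity. Experiments show that this intriguing property holds for two geophysical inversion problems over four different datasets. Compared with much deeper inversion networks, our method achieves comparable accuracy while using significantly fewer parameters.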
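Data-driven methods have proven to be promising techniques for solving complex scientific problems. Full-waveform inversion (FWI) is commonly formulated as an image-to-image translation task, which motivates the use of deep neural networks as an end-to-end solution. Despite being trained on synthetic data, deep-learning-driven FWI is expected to perform well when evaluated on sufficient real-world data. In this paper, we study such properties by asking: how robust are these deep neural networks, and how do they generalize? For robustness, we prove upper bounds on the deviation between predictions from clean and noisy data. Moreover, we demonstrate an interplay between the noise level and the additional gain in loss. For generalization, we prove a norm-based upper bound on the generalization error via a stability-generalization framework. Experimental results on seismic FWI datasets corroborate the theoretical results, shedding light on the use of deep learning for complex scientific applications.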
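We present OpenFWI, a collection of large-scale open-source benchmark datasets for seismic full-waveform inversion (FWI). OpenFWI is the first of its kind in the geoscience and machine learning communities and is intended to facilitate diversified, rigorous, and reproducible research on machine-learning-based FWI. OpenFWI includes datasets of multiple scales, encompasses diverse domains, and covers various levels of model complexity. Along with the datasets, we also perform an empirical study on each dataset with a fully convolutional deep learning model. OpenFWI has been meticulously maintained and will be regularly updated with new data and experimental results. We appreciate input from the community to help us further improve OpenFWI. In the current version, we release seven datasets in OpenFWI, of which one is designated for 3D FWI and the rest are for 2D scenarios. All datasets and related information can be accessed through our website at https://openfwi.github.io/.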
Unsupervised domain adaptation (UDA) for semantic segmentation is a promising task freeing people from heavy annotation work. However, domain discrepancies in low-level image statistics and high-level contexts compromise the segmentation performance over the target domain. A key idea to tackle this problem is to perform both image-level and feature-level adaptation jointly. Unfortunately, there is a lack of such unified approaches for UDA tasks in the existing literature. This paper proposes a novel UDA pipeline for semantic segmentation that unifies image-level and feature-level adaptation. Concretely, for image-level domain shifts, we propose a global photometric alignment module and a global texture alignment module that align images in the source and target domains in terms of image-level properties. For feature-level domain shifts, we perform global manifold alignment by projecting pixel features from both domains onto the feature manifold of the source domain; and we further regularize category centers in the source domain through a category-oriented triplet loss and perform target domain consistency regularization over augmented target domain images. Experimental results demonstrate that our pipeline significantly outperforms previous methods. In the commonly tested GTA5$\rightarrow$Cityscapes task, our proposed method using Deeplab V3+ as the backbone surpasses previous SOTA by 8%, achieving 58.2% in mIoU.
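For readers unfamiliar with the mIoU figure quoted above, the following minimal sketch computes mean intersection-over-union from flat lists of predicted and ground-truth class labels. The function name `mean_iou`, the toy labels, and the convention of skipping classes that appear in neither list are assumptions for illustration; benchmark toolkits usually accumulate a confusion matrix over the whole dataset instead.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes for flat label lists."""
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both lists (assumed convention)
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Hypothetical 4-pixel example with two classes:
# class 0 IoU = 1/2, class 1 IoU = 2/3, so the mean is 7/12.
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```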
Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade video visual quality. Subjective and objective measures capable of identifying and quantifying various types of PEAs are critical in improving visual quality. In this paper, we investigate the influence of four spatial PEAs (i.e. blurring, blocking, bleeding, and ringing) and two temporal PEAs (i.e. flickering and floating) on video quality. For spatial artifacts, we propose a visual saliency model with a low computational cost and higher consistency with human visual perception. In terms of temporal artifacts, self-attention based TimeSFormer is improved to detect temporal artifacts. Based on the six types of PEAs, a quality metric called Saliency-Aware Spatio-Temporal Artifacts Measurement (SSTAM) is proposed. Experimental results demonstrate that the proposed method outperforms state-of-the-art metrics. We believe that SSTAM will be beneficial for optimizing video coding techniques.
Image virtual try-on aims to replace the clothing on a person image with a garment image (in-shop clothes) and has attracted increasing attention from the multimedia and computer vision communities. Prior methods successfully preserve the character of clothing images; however, occlusion remains a pernicious obstacle to realistic virtual try-on. In this work, we first present a comprehensive analysis of occlusions and categorize them into two types: i) Inherent-Occlusion: the ghost of the former cloth still exists in the try-on image; ii) Acquired-Occlusion: the target cloth is warped onto an unreasonable body part. Based on this in-depth analysis, we find that the occlusions can be simulated by a novel semantically guided mixup module, which generates semantic-specific occluded images that work together with the try-on images to facilitate training a de-occlusion try-on (DOC-VTON) framework. Specifically, DOC-VTON first conducts sharpened semantic parsing on the try-on person. Aided by semantic guidance and a pose prior, textures of various complexities are selectively blended with human parts in a copy-and-paste manner. Then, the Generative Module (GM) synthesizes the final try-on image and jointly learns to de-occlude. Compared with state-of-the-art methods, DOC-VTON achieves better perceptual quality by reducing occlusion effects.
Panoptic Part Segmentation (PPS) unifies panoptic segmentation and part segmentation into one task. Previous works use separate approaches to handle thing, stuff, and part predictions, without shared computation or task association. We aim to unify these tasks at the architectural level, designing the first end-to-end unified framework, named Panoptic-PartFormer. Moreover, we find that the previous metric, PartPQ, is biased toward PQ. To handle both issues, we make the following contributions. First, we design a meta-architecture that decouples part features from things/stuff features. We model things, stuff, and parts as object queries and directly learn to optimize all three forms of prediction as a unified mask prediction and classification problem. We term our model Panoptic-PartFormer. Second, we propose a new metric, Part-Whole Quality (PWQ), to better measure this task from both pixel-region and part-whole perspectives. It can also decouple the errors of part segmentation and panoptic segmentation. Third, inspired by Mask2Former and building on our meta-architecture, we propose Panoptic-PartFormer++ and design a new part-whole interaction scheme using masked cross attention to further boost part segmentation quality. Finally, extensive ablation studies and analysis demonstrate the effectiveness of both Panoptic-PartFormer and Panoptic-PartFormer++. Compared with the previous Panoptic-PartFormer, Panoptic-PartFormer++ achieves improvements of 2% PartPQ and 3% PWQ on the Cityscapes PPS dataset and 5% PartPQ on the Pascal Context PPS dataset. On both datasets, Panoptic-PartFormer++ achieves new state-of-the-art results with a significant cost reduction of 70% in GFlops and 50% in parameters. Our models can serve as a strong baseline and aid future research in PPS. Code will be available.